emotion and intention
Emotion and Intention Guided Multi-Modal Learning for Sticker Response Selection
Hu, Yuxuan, Chen, Jian, Wang, Yuhao, Li, Zixuan, Xiong, Jing, Jia, Pengyue, Wang, Wei, Li, Chengming, Zhao, Xiangyu
Stickers are widely used in online communication to convey emotions and implicit intentions. The Sticker Response Selection (SRS) task aims to select the most contextually appropriate sticker based on the dialogue. However, existing methods typically rely on semantic matching and model emotional and intentional cues separately, which can lead to mismatches when emotions and intentions are misaligned. To address this issue, we propose Emotion and Intention Guided Multi-Modal Learning (EIGML). This framework is the first to jointly model emotion and intention, effectively reducing the bias caused by isolated modeling and significantly improving selection accuracy. Specifically, we introduce a Dual-Level Contrastive Framework to perform both intra-modality and inter-modality alignment, ensuring consistent representation of emotional and intentional features within and across modalities. In addition, we design an Intention-Emotion Guided Multi-Modal Fusion module that integrates emotional and intentional information progressively through three components: Emotion-Guided Intention Knowledge Selection, Intention-Emotion Guided Attention Fusion, and a Similarity-Adjusted Matching Mechanism. This design injects rich, effective information into the model and enables a deeper understanding of the dialogue, ultimately enhancing sticker selection performance. Experimental results on two public SRS datasets show that EIGML consistently outperforms state-of-the-art baselines, achieving higher accuracy and a better understanding of emotional and intentional features. Code is provided in the supplementary materials.
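The abstract does not spell out the loss used for intra- and inter-modality alignment. As a rough illustration only, the sketch below assumes an InfoNCE-style contrastive objective, a common choice for this kind of alignment. The function names, the temperature value, the toy feature vectors, and the way the two levels are summed are all hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i];
    the other positives in the batch act as negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        sims = [cosine(anchor, p) / temperature for p in positives]
        top = max(sims)  # subtract the max for numerical stability
        log_denominator = top + math.log(sum(math.exp(s - top) for s in sims))
        losses.append(log_denominator - sims[i])
    return sum(losses) / len(losses)

def dual_level_loss(text_emotion, text_intention, sticker_emotion):
    """Hypothetical combination (an assumption, not the paper's formula):
    intra-modality term aligns emotion and intention features of the dialogue,
    inter-modality term aligns dialogue and sticker emotion features."""
    intra = info_nce(text_emotion, text_intention)
    inter = info_nce(text_emotion, sticker_emotion)
    return intra + inter

# Toy batch of two dialogue/sticker pairs with 2-d features (made-up values).
text_emotion = [[1.0, 0.0], [0.0, 1.0]]
text_intention = [[0.9, 0.1], [0.1, 0.9]]
sticker_emotion = [[0.8, 0.2], [0.2, 0.8]]

aligned = dual_level_loss(text_emotion, text_intention, sticker_emotion)
# Reversing the sticker order breaks the pairing, so the loss should rise.
misaligned = dual_level_loss(text_emotion, text_intention, sticker_emotion[::-1])
```

In this toy setup the correctly paired batch yields a lower loss than the shuffled one, which is the behavior a contrastive alignment term is meant to reward.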
Fitting Different Interactive Information: Joint Classification of Emotion and Intention
Li, Xinger, Zhong, Zhiqiang, Huang, Bo, Yang, Yang
This paper presents the first-place solution for ICASSP MEIJU@2025 Track I, which focuses on low-resource multimodal emotion and intention recognition. Two questions are key to the competition: how to effectively use a large amount of unlabeled data, and how to ensure that tasks of different difficulty promote each other in the interaction stage. In this paper, a model trained on labeled data assigns pseudo-labels to the unlabeled data, and high-confidence samples are selected together with their labels to alleviate the low-resource problem. At the same time, intention recognition, which our experiments found to be the easier task to represent, is made to mutually promote emotion recognition under different attention heads, and fusion yields higher intention recognition performance. Finally, with the refined data, we achieve a score of 0.5532 on the test set and win the track championship.
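The pseudo-labeling step described above can be sketched as a simple confidence filter. This is a minimal illustration under stated assumptions: the 0.9 threshold, the function names, and the toy logits are hypothetical, and the abstract does not say how confidence was measured or which cutoff was used.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_pseudo_labels(unlabeled_logits, threshold=0.9):
    """Keep (index, label, confidence) for samples whose top class
    probability reaches the threshold (assumed value, not from the paper)."""
    selected = []
    for idx, logits in enumerate(unlabeled_logits):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        confidence = probs[label]
        if confidence >= threshold:
            selected.append((idx, label, confidence))
    return selected

# Toy predictions from a model trained on the labeled set (made-up values).
logits = [
    [4.0, 0.0, 0.0],  # confident in class 0
    [0.2, 0.1, 0.0],  # nearly uniform, should be dropped
    [0.0, 0.0, 5.0],  # confident in class 2
]
picked = select_pseudo_labels(logits, threshold=0.9)
```

The selected samples and their predicted labels would then be added to the training set, while low-confidence samples stay unlabeled.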
Cinema, AI, and Human Nature - Film School Rejects
Open any newspaper and you'll find a profusion of articles and op-eds debating the future of artificial intelligence. Elon Musk is terrified of it. The AI mania has even permeated the film world, which (the latest slew of "film is dead" articles warns us) will apparently not escape the automation boom. Of course, our public conversation about AI has long been tied to the cinema. As our own Sinead McCausland has pointed out, films have supplied the popular imagination with images and existential questions about AI since Fritz Lang's Metropolis.